QCBA: Postoptimization of Quantitative Attributes in Classifiers based on Association Rules

Author: Kliegr Tomas
Publication date: 18/10/2019

The need to prediscretize numeric attributes before they can be used in association rule learning is a source of inefficiencies in the resulting classifier. This paper describes several new rule tuning steps that aim to recover information lost in the discretization of numeric (quantitative) attributes, and a new rule pruning strategy, which further reduces the size of the classification models. We demonstrate the effectiveness of the proposed methods on postoptimization of models generated by three state-of-the-art association rule classification algorithms: Classification based on Associations (Liu, 1998), Interpretable Decision Sets (Lakkaraju et al., 2016), and Scalable Bayesian Rule Lists (Yang, 2017). Benchmarks on 22 datasets from the UCI repository show that the postoptimized models are consistently smaller, typically by about 50%, and have better classification performance on most datasets.

arXiv.org e-Print Archive

EFFECT OF COGNITIVE BIASES ON HUMAN UNDERSTANDING OF RULE-BASED MACHINE LEARNING MODELS

Author: Kliegr Tomas
Publication venue: Queen Mary University of London
Publication date: 29/01/2018

PhD

This thesis investigates to what extent cognitive biases affect human understanding of interpretable machine learning models, in particular of rules discovered from data. Twenty cognitive biases (illusions, effects) are analysed in detail, including identification of possibly effective debiasing techniques that can be adopted by designers of machine learning algorithms and software. This qualitative research is complemented by multiple experiments aimed at verifying whether, and to what extent, selected cognitive biases influence human understanding of actual rule learning results. Two experiments were performed: one focused on eliciting plausibility judgments for pairs of inductively learned rules, and the second involved a replication of the Linda experiment with crowdsourcing, along with two of its modifications. Altogether, nearly 3,000 human judgments were collected. We obtained empirical evidence for the insensitivity to sample size effect. There is also limited evidence for the disjunction fallacy, misunderstanding of "and", the weak evidence effect and the availability heuristic. While there seems to be no universal approach for eliminating all the identified cognitive biases, it follows from our analysis that the effect of many biases can be ameliorated by making rule-based models more concise. To this end, in the second part of the thesis we propose a novel machine learning framework which postprocesses rules on the output of the seminal association rule classification algorithm CBA [Liu et al., 1998]. The framework uses the original undiscretized numerical attributes to optimize the discovered association rules, refining the boundaries of literals in the antecedent of the rules produced by CBA. Some rules, as well as literals from the rules, can consequently be removed, which makes the resulting classifier smaller. A benchmark of our approach on 22 UCI datasets shows an average 53% decrease in the total size of the model, as measured by the total number of conditions in all rules. Model accuracy remains on the same level as for CBA.

Queen Mary Research Online
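
Both abstracts above mention refining the boundaries of interval literals using the original, undiscretized attribute values. The following is a minimal sketch of one way such a refinement could look; it is not taken from either work. It assumes a rule is a list of interval literals plus a target class, and it shrinks one literal to the tightest interval spanning the values of instances that the rule covers and classifies correctly. The names `IntervalLiteral`, `trim_literal`, the `label` key and the toy data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class IntervalLiteral:
    """Hypothetical rule condition of the form: low <= row[attribute] <= high."""
    attribute: str
    low: float
    high: float

    def covers(self, row):
        return self.low <= row[self.attribute] <= self.high


def trim_literal(literal, other_literals, target_class, rows, class_key="label"):
    """Shrink `literal` to the tightest interval spanning the raw values of
    rows that the whole rule covers and that belong to `target_class`.

    Illustrative assumption only, not the published algorithm.
    """
    values = [
        row[literal.attribute]
        for row in rows
        if literal.covers(row)
        and all(other.covers(row) for other in other_literals)
        and row[class_key] == target_class
    ]
    if not values:
        # No correctly classified instance is covered: leave the literal as is.
        return literal
    return IntervalLiteral(literal.attribute, min(values), max(values))


# Toy data with an undiscretized numeric attribute (made up for illustration).
rows = [
    {"humidity": 42.0, "label": "play"},
    {"humidity": 47.5, "label": "play"},
    {"humidity": 55.0, "label": "play"},
    {"humidity": 58.0, "label": "stay"},
    {"humidity": 75.0, "label": "stay"},
]

# A literal whose boundaries come from a prediscretized bin [40, 60].
binned = IntervalLiteral("humidity", 40.0, 60.0)
trimmed = trim_literal(binned, [], "play", rows)
print(trimmed)
```

In this toy case the bin [40, 60] shrinks to [42, 55], so the instance at 58.0, which the original literal misclassified, is no longer covered.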